94 research outputs found
AcListant with Continuous Learning: Speech Recognition in Air Traffic Control (EIWAC 2019)
Increasing air traffic creates many challenges for air traffic management (ATM). A general answer to these challenges is to increase automation. However, communication between air traffic controllers (ATCos) and pilots is still widely analog and far removed from digital ATM components. As the communication content matters to the ATM system, commands are still entered manually by ATCos so that the system can take the communication into account, at the cost of additional ATCo workload. To avoid this additional effort, automatic speech recognition (ASR) can analyze the communication automatically and extract the content of spoken commands. DLR, together with Saarland University, invented the AcListant® system, the first assistant-based speech recognition (ABSR) with both a high command recognition rate and a low command recognition error rate. Besides its high recognition performance, the AcListant® project revealed shortcomings with respect to the costly adaptation of the speech recognizer to different environments. To counteract this disadvantage, machine learning algorithms for the automatic adaptation of ABSR to different airports were developed within the Single European Sky ATM Research Programme (SESAR) 2020 Exploratory Research project MALORCA. To support the standardization of speech recognition in ATM, an ontology for ATC command recognition on the semantic level was developed in the SESAR Industrial Research project PJ.16-04, enabling the reuse of expensively hand-transcribed ATC communication. Finally, the results and experience feed into two further SESAR Wave 2 projects. This paper presents the evolution of ABSR from AcListant® via MALORCA and PJ.16-04 to the SESAR Wave 2 projects.
Early Callsign Highlighting using Automatic Speech Recognition to Reduce Air Traffic Controller Workload
The primary task of an air traffic controller (ATCo) is to issue instructions to pilots. However, the first contact is often initiated by the pilot. A controller assistance system that recognizes and highlights the spoken callsign as early as possible, directly from the speech data, is therefore useful. We propose to use an automatic speech recognition (ASR) system to obtain the speech-to-text transcription, from which we extract the spoken callsign. As high callsign recognition performance is required, we additionally exploit surveillance data. We obtain callsign recognition error rates of 6.2% and 8.3% for ATCo and pilot utterances, respectively, which improve to 2.8% and 4.5% when information from surveillance data is used.
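The idea of matching spoken callsigns against surveillance data can be sketched as follows. This is a minimal illustration, not the system described above: the airline-designator words, the similarity threshold, and the matching strategy are all assumptions for the sake of the example.

```python
from difflib import SequenceMatcher

# Hypothetical spoken forms for a few airline designators and ICAO digit/letter words.
AIRLINE_WORDS = {"DLH": "lufthansa", "BAW": "speedbird", "RYR": "ryanair"}
ICAO = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
        "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
        "A": "alpha", "B": "bravo", "C": "charlie", "X": "x-ray"}

def spoken_form(callsign):
    """Expand e.g. 'DLH32A' into 'lufthansa three two alpha'."""
    airline, tail = callsign[:3], callsign[3:]
    words = [AIRLINE_WORDS.get(airline, airline.lower())]
    words += [ICAO.get(ch, ch.lower()) for ch in tail]
    return " ".join(words)

def extract_callsign(transcript, surveillance_callsigns):
    """Return the surveillance callsign whose spoken form best matches any
    window of the ASR transcript -- a crude stand-in for entity boosting."""
    words = transcript.split()
    best, best_score = None, 0.0
    for cs in surveillance_callsigns:
        target = spoken_form(cs)
        n = len(target.split())
        for i in range(max(1, len(words) - n + 1)):
            window = " ".join(words[i:i + n])
            score = SequenceMatcher(None, window, target).ratio()
            if score > best_score:
                best, best_score = cs, score
    return best if best_score > 0.7 else None

print(extract_callsign("lufthansa three two alpha descend flight level eight zero",
                       ["DLH32A", "BAW7C", "RYR123"]))  # DLH32A
```

Restricting the candidate set to callsigns actually present in the surveillance picture is what makes the error-rate improvement plausible: the matcher only has to discriminate among a handful of aircraft rather than the whole airline fleet.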
ATTENTION: TARGET AND ACTUAL – THE CONTROLLER FOCUS
The main task of an air traffic controller (ATCO) is to ensure safe and efficient air traffic control (ATC). To do so, the ATCO needs to have his or her attention at the right place at the right time on the displays of the controller working position. This will become even more challenging in the future with increasing information diversity, growing levels of automation, a more complex air traffic mix, new technologies, and bigger screens. To deal with these challenges, an attention-guiding assistance system is being developed to support the ATCO. This system needs to determine the area of target attention from relevant upcoming ATC events. It should also determine the current area of attention from the ATCO's gaze, e.g., via eye-tracking, and evaluate it. If there is a mismatch between the target and the actual area of attention, the ATCO's attention focus has to be appropriately guided to the relevant areas via cues. Based on an analysis of attention and situation awareness, attention guidance mechanisms have been developed and successfully validated in human-in-the-loop trials. ATCOs felt well supported by visual, non-intrusive guidance cues and even wanted to have such functionality in today's working positions.
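The target-versus-actual comparison described above can be reduced to a simple sketch: trigger a cue when the gaze has stayed away from the target area for too long. The area names, dwell threshold, and data layout are illustrative assumptions, not the validated system's design.

```python
from dataclasses import dataclass

@dataclass
class GazeSample:
    t: float        # timestamp in seconds
    area: str       # screen area the eye-tracker maps the gaze to

def guidance_needed(target_area, gaze_samples, dwell_s=2.0):
    """Return True when the gaze has been away from the target area for
    at least dwell_s seconds -- the mismatch condition for showing a cue."""
    away_since = None
    for s in gaze_samples:
        if s.area == target_area:
            away_since = None           # gaze returned to target: reset
        elif away_since is None:
            away_since = s.t            # gaze just left the target area
        elif s.t - away_since >= dwell_s:
            return True                 # mismatch persisted: guide attention
    return False

samples = [GazeSample(0.0, "flight_list"), GazeSample(1.0, "flight_list"),
           GazeSample(2.5, "flight_list")]
print(guidance_needed("radar_sector_east", samples))  # True
```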
Automatic Speech Analysis Framework for ATC Communication in HAAWAII
Over the past years, several SESAR-funded exploratory projects focused on bringing speech and language technologies to the Air Traffic Management (ATM) domain and demonstrating their added value through successful applications. The recently ended HAAWAII project developed a generic architecture and framework, which was validated through several tasks such as callsign highlighting, pre-filling radar labels, and readback error detection. The primary goal was to support pilot and air traffic controller communication by deploying Automatic Speech Recognition (ASR) engines. Contextual information (if available) extracted from surveillance data, flight plan data, or previous communication can be exploited via entity boosting to further improve recognition performance. HAAWAII proposed various design attributes for integrating the ASR engine into the ATM framework, often depending on the concrete technical specifics of the target air navigation service providers (ANSPs). This paper gives a brief overview and provides an objective assessment of the speech processing components developed and integrated into the HAAWAII framework. Specifically, the following tasks are evaluated with respect to the application domain: (i) speech activity detection, (ii) speaker segmentation and speaker role classification, and (iii) ASR. To the best of our knowledge, the HAAWAII framework offers the best performing speech technologies for ATM, reaching high recognition accuracy (with error correction done by exploiting additional contextual data), robustness (models developed using large training corpora), and support for rapid domain transfer (to a new ATM sector with minimum investment). Two scenarios provided by ANSPs were used for testing, achieving callsign detection accuracies of about 96% and 95% for NATS and ISAVIA, respectively.
Brain–Computer Interface-Based Adaptive Automation to Prevent Out-Of-The-Loop Phenomenon in Air Traffic Controllers Dealing With Highly Automated Systems
Increasing the level of automation in air traffic management is seen as a measure to increase the performance of the service to satisfy the predicted future demand. This is expected to result in new roles for the human operator, who will mainly monitor highly automated systems and seldom intervene. Therefore, air traffic controllers (ATCos) would often work in a supervisory or control mode rather than in a direct operating mode. However, it has been demonstrated that human operators in such a role are affected by human performance issues known as the Out-Of-The-Loop (OOTL) phenomenon, consisting of a lack of attention, loss of situational awareness, and de-skilling. A countermeasure to this phenomenon has been identified in adaptive automation (AA), i.e., a system able to allocate the operative tasks to the machine or to the operator depending on their needs. In this context, psychophysiological measures have been highlighted as a powerful tool to provide a reliable, unobtrusive, and real-time assessment of the ATCo's mental state to be used as the control logic for AA-based systems. This paper presents the so-called "Vigilance and Attention Controller", a system based on electroencephalography (EEG) and eye-tracking (ET) techniques, which aims to assess in real time the vigilance level of an ATCo dealing with a highly automated human–machine interface and to use this measure to adapt the level of automation of the interface itself. The system was tested on 14 professional ATCos performing two highly realistic scenarios, one with the system disabled and one with the system enabled. The results confirmed that (i) long, highly automated tasks induce decreasing vigilance and OOTL-related phenomena; (ii) EEG measures are sensitive to these kinds of mental impairments; and (iii) AA was able to counteract this negative effect by keeping the ATCo more involved in the operative task. The results were confirmed by EEG and ET measures as well as by performance and subjective ones, providing a clear example of the potential applications and related benefits of AA.
Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training
Thanks to Alexa, Siri, or Google Assistant, automatic speech recognition (ASR) has changed our daily life during the last decade. Prototypic applications in the air traffic management (ATM) domain are available. Recently, pre-filling radar label entries with ASR support has reached the technology readiness level before industrialization (TRL6). However, seldom spoken, airspace-related words that are relevant in the ATM context remain a challenge for sophisticated applications. Open-source ASR toolkits and large pre-trained models, which allow experts to tailor ASR to new domains, can be exploited under the typical constraint that a certain amount of domain-specific training data is available, i.e., transcribed speech for adapting the acoustic and/or language models. In general, it is sufficient for a "universal" ASR engine to reliably recognize the few hundred words that form the vocabulary of the voice communications between air traffic controllers and pilots. However, for each airport some hundred additional, seldom spoken words need to be integrated. These challenging word entities comprise special airline designators and waypoint names such as "dexon" or "burok", which only appear in a specific region. When used, they are highly informative and thus require high recognition accuracy. Plug-and-play customization with minimal expert manipulation assumes that no additional training, i.e., fine-tuning of the universal ASR, is required. This paper presents an innovative approach to automatically integrate new specific word entities into the universal ASR system. The recognition rate of these region-specific word entities increases by a factor of 6 with respect to the universal ASR.
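One lightweight way to favor such region-specific words without re-training is to re-rank the recognizer's n-best list, adding a score bonus to hypotheses containing them. The sketch below assumes this kind of rescoring; the bonus value and the example hypotheses are invented for illustration and do not reproduce the paper's actual method.

```python
def rescore(nbest, boost_words, bonus=2.0):
    """Re-rank an ASR n-best list of (text, log-probability) pairs by adding
    a log-score bonus for each region-specific word a hypothesis contains.
    A lightweight alternative to fine-tuning the universal model."""
    def score(hyp):
        text, logprob = hyp
        hits = sum(1 for w in text.split() if w in boost_words)
        return logprob + bonus * hits
    return max(nbest, key=score)

nbest = [("direct to duesseldorf", -4.1),
         ("direct dexon", -5.0)]          # correct, but lower acoustic score
boost = {"dexon", "burok"}                # waypoint names added without re-training
print(rescore(nbest, boost))              # ('direct dexon', -5.0)
```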
Ensuring Safety for Artificial-Intelligence-Based Automatic Speech Recognition in Air Traffic Control Environment
This paper describes the safety assessment conducted in the SESAR2020 project PJ.10-W2-96 ASR on automatic speech recognition (ASR) technology implemented for air traffic control (ATC) centers. ASR already enables the automatic recognition of aircraft callsigns and various ATC commands, including command types, based on controller–pilot voice communications, for presentation at the controller working position. The presented safety assessment process consists of defining design requirements for the ASR technology application in normal, abnormal, and degraded modes of ATC operations. A total of eight functional hazards were identified based on the analysis of four use cases. The safety assessment was supported by top-down and bottom-up modelling and analysis of the causes of the hazards to derive system design requirements for the purpose of mitigating them. The assessment of whether the specified design requirements were achieved was supported by evidence generated from two real-time simulations with pre-industrial ASR prototypes in approach and en-route operational environments. The simulations, focusing especially on the safety aspects of the ASR application, also validated the hypotheses that ASR reduces controllers' workload and increases situational awareness. The missing validation element, i.e., an analysis of the safety effects of ASR in ATC, is the focus of this paper. As a result of the safety assessment activities, mitigations were derived for each hazard, demonstrating that the use of ASR does not increase safety risks and is, therefore, ready for industrialization.
Grammar Based Speaker Role Identification for Air Traffic Control Speech Recognition
Automatic Speech Recognition (ASR) for air traffic control is generally trained by pooling Air Traffic Controller (ATCO) and pilot data. In practice, this is motivated by the proportion of annotated data from pilots being smaller than that from ATCOs. However, due to the data imbalance between ATCO and pilot data and their varying acoustic conditions, ASR performance is usually significantly better for ATCO speech than for pilot speech. Obtaining the speaker roles requires manual effort when the voice recordings are collected using Very High Frequency (VHF) receivers and the data is noisy and single-channel, without the push-to-talk (PTT) signal. In this paper, we propose to (1) split the ATCO and pilot data using an intuitive approach exploiting ASR transcripts and (2) consider ATCO and pilot ASR as two separate tasks for Acoustic Model (AM) training. The paper focuses on applying this approach to noisy data collected using VHF receivers, as this data is helpful for training despite its noisy nature. We also developed a simple yet efficient knowledge-based system for speaker role classification based on the grammar defined by the International Civil Aviation Organization (ICAO). Our system accepts text as input, i.e., either gold annotations or transcripts generated by an ABSR system. This approach provides an average accuracy in speaker role identification of 83%. Finally, we show that training AMs separately for each task, or using a multitask approach, is better suited to the noisy data than the traditional ASR system, where all data is pooled together for AM training.
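A grammar-based role classifier of this kind can exploit callsign position: ICAO phraseology puts the callsign first in controller instructions, while pilots usually append theirs to a readback. The following sketch assumes that simplified rule and a toy airline-word list; it is an illustration of the idea, not the paper's 83%-accurate system.

```python
# Toy list of airline telephony designators that open a callsign (assumption).
CALLSIGN_AIRLINES = {"lufthansa", "speedbird", "ryanair"}

def speaker_role(transcript):
    """Classify ATCO vs. pilot from callsign position in the transcript:
    callsign-first -> ATCO instruction, callsign-last -> pilot readback."""
    words = transcript.lower().split()
    if not words:
        return "unknown"
    if words[0] in CALLSIGN_AIRLINES:
        return "atco"
    if any(w in CALLSIGN_AIRLINES for w in words[-4:]):
        return "pilot"
    return "unknown"

print(speaker_role("lufthansa three two alpha descend flight level eight zero"))   # atco
print(speaker_role("descending flight level eight zero lufthansa three two alpha"))  # pilot
```

Because the rule operates on text only, it works equally on gold annotations and on ABSR output, which is exactly the property the abstract highlights.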
How to Measure Speech Recognition Performance in the Air Traffic Control Domain? The Word Error Rate is only half of the truth
Applying Automatic Speech Recognition (ASR) in the domain of analogue voice communication between air traffic controllers (ATCos) and pilots involves more end-user requirements than just transforming spoken words into text. Perfect word recognition is useless as long as the semantic interpretation is wrong. For an ATCo it is of no importance whether the words of a greeting are correctly recognized; a wrong recognition of a greeting should, however, not disturb the correct recognition of, e.g., a "descend" command. Recently, 14 European partners from the Air Traffic Management (ATM) domain agreed on a common set of rules, i.e., an ontology on how to annotate the speech utterances of an ATCo. This paper first extends the ontology to pilot utterances and then compares different ASR implementations on the semantic level by introducing command recognition, command recognition error, and command rejection rates. The implementation used in this paper achieves a command recognition rate better than 94% for Prague Approach, even when the word error rate (WER) is above 2.5%.
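The command-level metrics can be computed by comparing recognized commands against gold annotations per utterance. One possible definition is sketched below; the exact counting rules (and the example command strings) are assumptions, as projects may define the rates differently.

```python
def command_metrics(gold, recognized):
    """Compare per-utterance gold command annotations with recognized ones.
    recognition rate = correctly recognized commands / all gold commands
    error rate       = recognized-but-wrong commands  / all gold commands
    rejection rate   = commands with no output (None) / all gold commands"""
    total = correct = wrong = rejected = 0
    for g, r in zip(gold, recognized):
        total += 1
        if r is None:
            rejected += 1
        elif r == g:
            correct += 1
        else:
            wrong += 1
    return {"recognition": correct / total,
            "error": wrong / total,
            "rejection": rejected / total}

gold = ["DLH32A DESCEND 80 FL", "BAW7C REDUCE 180 kt", "RYR123 TURN_LEFT 120"]
reco = ["DLH32A DESCEND 80 FL", "BAW7C REDUCE 160 kt", None]
print(command_metrics(gold, reco))
```

This distinction is the point of the title: a hypothesis with a low WER can still carry a wrong value ("160" instead of "180"), which counts as a full command recognition error at the semantic level.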